Preference for response-contingent vs. free reinforcement



Similar resources

Model-Free Preference-Based Reinforcement Learning

Specifying a numeric reward function for reinforcement learning typically requires a lot of hand-tuning from a human expert. In contrast, preference-based reinforcement learning (PBRL) utilizes only pairwise comparisons between trajectories as a feedback signal, which are often more intuitive to specify. Currently available approaches to PBRL for control problems with continuous state/action sp...


Contingent Features for Reinforcement Learning

Applying reinforcement learning algorithms in real-world domains is challenging because relevant state information is often embedded in a stream of high-dimensional sensor data. This paper describes a novel algorithm for learning task-relevant features through interactions with the environment. The key idea is that a feature is likely to be useful to the degree that its dynamics can be controll...


A Comparison of Noncontingent Plus Contingent Reinforcement to Contingent Reinforcement Alone on Students’ Academic Performance

Noncontingent reinforcement (NCR) can be described as time-based or response-independent delivery of stimuli with known reinforcing properties. Previous research has shown NCR to reduce problem behavior in individuals with developmental disabilities and to interfere with the acquisition of more desired alternative behavior. To date, however, little research has examined the effects of NCR on ch...


Preference-Based Policy Iteration: Leveraging Preference Learning for Reinforcement Learning

This paper makes a first step toward the integration of two subfields of machine learning, namely preference learning and reinforcement learning (RL). An important motivation for a “preference-based” approach to reinforcement learning is a possible extension of the type of feedback an agent may learn from. In particular, while conventional RL methods are essentially confined to deal with numeri...


Preference-based Reinforcement Learning

This paper investigates the problem of policy search based on the only expert’s preferences. Whereas reinforcement learning classically relies on a reward function, or exploits the expert’s demonstrations, preference-based policy learning (PPL) iteratively builds and optimizes a policy return estimate as follows: The learning agent demonstrates a few policies, is informed of the expert’s prefer...



Journal

Journal title: Bulletin of the Psychonomic Society

Year: 1977

ISSN: 0090-5054

DOI: 10.3758/bf03329300